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We introduce a fast implementation of the pivot algorithm for self-avoiding walks, which we use to obtain 
large samples of walks on the cubic lattice of up to 33 x 10 6 steps. Consequently the critical exponent v for 
three-dimensional self-avoiding walks is determined to great accuracy; the final estimate is v = 0.587 597(7). 
The method can be adapted to other models of polymers with short-range interactions, on the lattice or in the 
continuum. 
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The self-avoiding walk (SAW) model is an important model 
in statistical physics [1]. It models the excluded-volume effect 
observed in real polymers, exactly capturing universal fea- 
tures such as critical exponents. It is also the n limit of 
the n-vector model, which includes the Ising model (n = 1) 
as another instance, thus serving as an important model in the 
study of critical phenomena. Exact results are known for self- 
avoiding walks in two dimensions [2, 3] and for d > 4 (mean- 
field behavior has been proved for d > 5 [4]), but not for the 
most physically interesting case of d = 3. 

We have efficiently implemented the pivot algorithm via a 
data structure we call the SAW-tree, which allows rapid Monte 
Carlo simulation of SAWs of millions of steps. We discuss 
this implementation in general terms here, and then use this 
implementation to accurately calculate the critical exponent v 
for Z 3 . More details about the implementation can be found 
in a companion article [5]. This new algorithm can also be 
adapted to other models of polymers with short-range interac- 
tions, on the lattice and in the continuum, and hence promises 
to be widely useful. 

An A-step SAW on 7L d is a mapping u : {0, 1, . . . , N} -t 
Z d with + 1) — = 1 for each i (\x\ denotes the 

Euclidean norm of x), and with uj(i) ^ ui(j) for all i ^ j. 
We generate three-dimensional SAWs via the pivot algorithm, 
and calculate various observables which characterize the size 
of the SAWs: the squared end-to-end distance i? 2 , the squared 
radius of gyration i? 2 , and the mean-square distance of a 
monomer from its endpoints Rf^, where 

i? c 2 = MAO-c(o)| 2 , 
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We seek to calculate the mean values of these observables over 
all SAWs of N steps, where each SAW is given equal weight. 
Their asymptotic forms are expected to be described by 
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with < Ai < A 2 < • • ■ , and where additional terms of 
the form c / N ka+k ^+ k ^+ k ^+- (k Q , k u k 2 ,k 3 , ■ • • > 0) 
are not shown. In addition, af indicates terms arising from 
the anti-ferromagnetic singularity, which occurs in models on 
loose-packed lattices such as 7L d ; these terms are negligible 
compared with terms included in fits. The exponents v, A l7 
and A2 are universal, i.e. they are dependent only on the di- 
mensionality of the lattice and the universality class of the 
model, while the amplitudes D x are observable dependent. 
However, amplitude ratios, such as D g /D c and b g /b e , are uni- 
versal quantities. 

The pivot algorithm is a powerful approach to the study of 
self-avoiding walks, invented by Lai [6] and later elucidated 
and popularized by Madras and Sokal [7]. From an initial 
SAW of length N, such as a straight rod, new A-step walks 
are successively generated by choosing a site of the walk at 
random, and attempting to apply a lattice symmetry opera- 
tion, or pivot, to one of the parts of the walk; if the result- 
ing walk is self-avoiding the move is accepted, otherwise the 
move is rejected and the original walk is retained. The group 
of lattice symmetries for 1? has 48 elements, and we use all 
of them except the identity as potential pivot operations; other 
choices are possible. Thus a Markov chain is formed in the en- 
semble of SAWs of fixed length; this chain satisfies detailed 
balance and is ergodic, ensuring that SAWs are sampled uni- 
formly at random. Furthermore, as demonstrated by Madras 
and Sokal [7] through strong heuristic arguments and numeri- 
cal experiments, the Markov chain has a short integrated auto- 
correlation time for global observables, thus making the pivot 
algorithm extremely efficient in comparison to Markov chains 
utilizing local moves. See [7, 8] for detailed discussion. 

The implementation of Madras and Sokal utilized a hash 
table to record the location of each site of the walk. They 
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showed that the pivot algorithm has integrated autocorrelation 
time 0(N P ), with p dimension-dependent but close to zero 
(p < 0.2), and argued convincingly that the CPU time per 
successful pivot is O(N) for their implementation. 

Madras and Sokal argued that O(N) is best possible be- 
cause it takes time of order N to merely write down an A-step 
SAW. However, Kennedy [9] recognized that it is not neces- 
sary to write down the SAW for each successful pivot, and 
from clever use of geometric constraints developed an algo- 
rithm that broke the 0(N) barrier. The CPU time for this 
implementation grows as a dimension-dependent fractional 
power of N (see Table I). 

METHOD 

We have extended this idea to obtain a radical further im- 
provement: for Z 2 and Z 3 the mean CPU time per attempted 
pivot, which we denote T(N), is now only 0(log N) for the 
range of N studied, and we have a theoretical argument that 
the large N behavior is 0(1). The key observation is that 
although there are typically O(N) nearest neighbor contacts 
for a SAW of length N, the number of contacts between two 
halves of a SAW is typically 0(1), as shown via renormal- 
ization group [10] and Monte Carlo [11] methods. When we 
attempt to pivot part of a SAW, it is guaranteed that each of the 
two sub-walks remain self-avoiding, and hence we only need 
to determine if the sub-walks intersect. If the resulting walk 
is self-avoiding, then we expect, on average, that there will be 
a constant number of contacts between the two sub-walks. 

We will now briefly discuss the relevant data structure and 
algorithms; full details can be found in [5]. We implement 
a binary tree data structure (see e.g. [12]) which we call a 
SAW-tree. The root node of the SAW-tree contains informa- 
tion about the whole walk, including R%, Jig, i?^, and its min- 
imum bounding box, which is the smallest rectangular prism 
with faces of the form Xi = c which completely contains the 
walk. The two children of the root node are valid SAW-trees, 
and contain bounding box information for the first and second 
halves of the SAW, etc., until the leaves of the tree store indi- 
vidual sites. The SAW-tree is related to the R-tree [13], a data 
structure used in the field of computational geometry, but with 
additional information encoding the state of the SAW. Thus 
far the SAW-tree has been implemented for 1 d , but can be 
straightforwardly adapted to other lattices and the continuum, 
as well as other polymer models with short-range interactions. 
To guarantee optimal performance, we implement the SAW- 
tree so that it is balanced, i.e. so that the depth never exceeds 
some fixed constant times log N. We define the level of a node 
as the number of generations between a node and the leaves. 

Bounding boxes enable us to rapidly determine if two sub- 
walks intersect after a pivot attempt: if two bounding boxes 
do not intersect, then the sub-walks which they contain can- 
not intersect. If a pivot attempt is successful, then it is nec- 
essary to resolve all intersections between bounding boxes of 
different nodes in the tree on opposite sides of the pivot site. 



Our implementation ensures that intersection tests are typi- 
cally performed between bounding boxes of nodes which are 
at the same level. We argue in [5] that the nodes at fixed level 
in the SAW-tree form a renormalized walk, and the intersec- 
tions between bounding boxes correspond to contacts in the 
original walk. This implies that at each level there are O(l) 
intersections, and as the tree has O (log AT) levels this leads 
to the conclusion that a successful pivot takes time 0(log N). 
Successful pivots occur with probability 0(N~ P ), so overall 
mean time spent on successful pivots is 0{N~ P log N). When 
a pivot attempt is unsuccessful, with high probability the first 
intersection occurs near the pivot site. Thus only a small frac- 
tion of the SAW-tree needs to be traversed to find the inter- 
section, and we argue in [5] that this takes mean time O(l). 
Unsuccessful pivots occur with probability O(l), and so the 
overall behavior is T(N) = 0(N~ P log N + 1) = O(l). In 
Fig. 1 we show T(N) for Z 2 and Z 3 from a separate data run, 
with maximum length N = 2 28 - 1 w 2.68 x 10 8 . In both 
cases it is apparent there is a crossover due to the shorter la- 
tency of cache versus main memory. In [5] we argue that O(l) 
behavior may be reached only for very large N, which makes 
interpretation of Fig. 1 difficult. For Z 2 some curvature is vis- 
ible, and the trend appears consistent with T(N) approaching 
a constant for sufficiently large N. The exponent p is smaller 
for Z 3 (p w 0.11) compared with Z 2 (p ss 0.19); hence, the 
approach to a constant is far slower, and in fact almost no cur- 
vature is visible for Z 3 . We believe the numerical evidence 
provides a strong case that T(N) is at most 0(log N), and is 
consistent with T(N) = O(l); see [5] for more details. 

T(N) is shown for the various implementations in Table 
I. For SAWs of length N = 10 6 on the cubic lattice, the 
performance gain for our implementation is approximately 
200 when compared with Kennedy's, and over a thousand 
when compared with that of Madras and Sokal [21]. The 
dramatic performance gain from the new implementation not 
only makes it possible to obtain large samples of walks with 
millions of steps, it also makes the regime of very long walks, 
of up to 10 9 steps, accessible to computer experiments. 

For SAWs of length N, it is expected that the exponential 
autocorrelation time is approximately O(Nff) [7], where / 
is the fraction of pivot attempts which are successful. The 
first 2QN/f configurations were discarded, ensuring that for 
all practical purposes SAWs were sampled from the uniform 
distribution. Batch estimates of (i? 2 ) 33554431, using a batch 
size of 10 s , are shown in Sec. 4 of [5]; in this case the first 
50 batches were discarded, while initialization bias is visually 
apparent for (at most) the first 10 batches. 

The computer experiment was performed on a cluster of 



TABLE I: T(N), mean time per attempted pivot for iV-step SAWs. 
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FIG. 1: T(N) for Z 2 and Z 3 . Note that these estimates were ob- 
tained in a separate data run on a different computer from the main 
experiment, with lengths from N = 2 7 - 1 to N = 2 28 - 1. 



AMD Opteron Barcelona 2.3GHz quad core processors, for a 
total of 16500 CPU hours. Code was written in C, and com- 
piled with gcc. 10 11 pivot attempts were made on SAWs of 
length ranging from 15 to 3.36 x 10 7 , for a grand total of 
1.89 x 10 13 pivot attempts. Data were collected from ev- 
ery pivot attempt for / and the Euclidean-invariant moments 

Rf = {Rlf with x € {e,g,m},l < k < 5 [22]. The 
longest walks with N = 3.36 x 10 7 required 3GB of mem- 
ory; much longer walks could conceivably be simulated in 
the future. By comparing fits from the whole data set {N < 
3.36 x 10 7 ) with fits from areduced data set (N < 2.1 x 10 6 ), 
we confirmed that data from the longest walks were indeed 
highly useful in tying down the various estimates (see Fig. 1 
in [14]). However, the greatest benefit from the simulation of 
truly long walks, of say 10 9 steps, may be the ability to di- 
rectly simulate properties of realistic systems, such as DNA 
knotting, rather than determination of universal parameters. 

Monte Carlo estimates of global parameters are collected 
in Tables II- V, Sec. 2 of [14], with confidence intervals calcu- 
lated using the standard binning technique. For all lengths 
studied the integrated autocorrelation time of the Markov 
chain is much less than the batch size of 10 8 . We confirm 
the accuracy of the confidence interval estimates by studying 
the effect of batch size in Sec. 5 of [5]. 



ANALYSIS 

We estimated the critical exponents v and Ai sw 1/2 and 
associated amplitudes D x and b x by fitting the leading term 
and leading correction of Eq. (1) via weighted non-linear re- 
gression. We truncated the data set by requiring A~ > N m i n , 
with N m [ n a free parameter. We shifted the value of A" of 
Eq. 1 by an amount SN X to obtain smoother convergence by 
altering the sub-leading corrections (see e.g. [15]); estimates 
for v, Ai, D x , and b x are unaffected in the limit N m i n — > oo. 



(R 2 X ) N = D X (N + SN X ) 
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Unfortunately we cannot fit the next-to-leading corrections 
with exponents 1, A 2 ~ 1, and 2Ai w 1 as the differences be- 
tween them are far too small to resolve. For sufficiently large 
Amin we found that reduced \ 2 values for all fits approached 
1 from above, indicating Eq. 2 is asymptotically correct. 

Final estimates of parameters have been made directly from 
Figs. 2 and 3, combining multiple sources visually in an at- 
tempt to make estimates robust, and allow the reader to criti- 
cally evaluate our final results. We do not distinguish between 
subjective and statistical errors, as we believe that in this con- 
text the distinction is itself quite subjective [23]. We provide 
here some guidance for the interpretation of Figs. 2 and 3, and 
refer the interested reader to [16] for (much) more information 
on series analysis. 

• We plot estimates against N~f , where y is chosen such 
that N~f is of the same order as the residual error from 
the fit. The estimates for Ai and b x have y = 1 — Ai w 
0.47, and y = 1 for v and D x . 

• We seek to extrapolate the fits to A^n = oo, or N~f = 
0. Depending upon whether the true value y C xact is l ess 
than, equal to, or greater than y, the estimates would 
approach a limiting value at N~^ n — with infinite, 
finite, or zero slope respectively. 

• Successive estimates are highly correlated, and so any 
trend which lies within the error bars should be disre- 
garded. Only some of the error bars are plotted in order 
to reduce visual noise. 

• There are no bounds on the errors of the truncated 
asymptotic formulae, and hence the interpretation of the 
graphs is subjective. The underlying systematic error is 
observable dependent, and so combining estimates from 
a variety of observables improves robustness. 

In Fig. 2 we plot estimates of Ai with our final result plot- 
ted at 0; the error bar reflects the scatter between observables. 
In Fig. 3 we plot estimates for v, biased with the lower and 
upper limits of our range for Ai, with our final result at 0. 
Similar plots for the amplitudes are given in Figs. 5 and 6 of 
[14]. We have conservatively chosen the error bar for the fi- 
nal result to encompass estimates from all observables (R x ). 
As the amplitudes D x are highly correlated with estimates for 
Ai, the biasing of Ai greatly extends the range over which 
stable fits can be obtained. This is the reason the biased fits 
are preferred over the unbiased fits shown in Fig. 3 of [14]. 



RESULTS 

We report our final results in Table II, and in addition 
we have D m = 0.58687(12), b c = -0.49(5), b g = 
-0.1125(125), and b m = -0.295(30). If one assumes the 
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hyperscaling relation dv = 2 — a, then one also obtains 
a = 0.237209(21). The estimates of v, D c , and D g are in 
accordance with previous results, although considerably more 
accurate. The estimate for Ax is more accurate than the pre- 
vious Monte Carlo value [8], but less accurate than the Monte 
Carlo renormalization group estimate of [17], which relies 
upon an uncontrolled although accurate approximation. The 
claimed accuracy of the field theory estimates [18] for Ai is 
also comparable, but as discussed by Li et al. [8] these cal- 
culations have underlying systematic errors of uncertain mag- 
nitude. Any desired amplitude ratios can be calculated from 
the amplitude estimates. The rational number with smallest 
denominator within 3 standard deviations of v — 0.587597 is 
161/274, suggesting that v cannot be expressed as a rational 
number with small denominator. 



TABLE II: Comparison of parameter estimates. 



Source" 



This Letter 0.587597(7) 0.528(12) 1.22035(25) 0.19514(4) 



[15]° Series 0.58774(22) 

[19] MC 0.5874(2) 

[20] c Series 0.58755(55) 1.225 

[18]FTd = 3 0.5882(11) 0.478(10) 

[18] FT e be 0.5878(11) 0.486(16) 

[17]MCRG 0.58756(5) 0.5310(33) 



1.2178(54) 



[8]" MC 



0.5877(6) 0.56(3) 



1.21667(50) 0.19455(7) 



"Abbreviations: MC = Monte Carlo, FT = Field theory, d = 3 = d = 3 
expansion, e be = e expansion with boundary conditions, MCRG = Monte 
Carlo renormalization group. 

*Using Eqs. (74) and (75) with 0.516 < Ai < 0.54. 

c No error estimates were made in [20], but estimates for v were in the range 
0.5870 < v < 0.5881. 

d \n addition b e = -0.483(39), b g = -0.1143(47). D c , D g , b e , and 
fe g estimates were biased with v = 0.5877, Ai = 0.56; the confidence 
intervals were not intended to be taken seriously. 



We would like to stress that, due to the neglect of sub- 
leading terms, there are underlying systematic errors in our 
estimates which are not and cannot be controlled. We have 
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FIG. 2: Estimates for Ai. 
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FIG. 3: Estimates for v from fits with biased Ai. We show the en- 
velope of maximum and minimum values of endpoints of error bars 
for all observables. 



the luxury of high quality data from long walks, and have at- 
tempted to be conservative with our claimed errors, but ac- 
knowledge there is a risk that the (subjective) confidence in- 
tervals may not be sufficiently large. 



CONCLUSION 

In summary, an efficient version of the pivot algorithm for 
SAWs has been implemented and used to calculate v\ the al- 
gorithms developed promise to be widely useful in the Monte 
Carlo simulation of SAWs and related models of polymers. 
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